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Abstract. We use PostgreSQL DBMS for storing XML metadata, de- 
scribed by the IVOA Characterisation Data Model. Initial XML type 
support in the PostgreSQL has recently been implemented. We make 
heavy use of this feature in order to provide comprehensive search over 
Characterisation metadata tree. We built a prototype of the Character- 
isation metadata query service, implementing two access methods: (1) 
HTTP-GET/POST based interface implements almost direct translation 
of the query parameter name into XPath of the data model element in 
the XML serialisation; (2) Web-Service based interface to receive XQuery 
which is also directly translated into XPath. This service will be used in 
the ASPID-SR archive, containing science-ready data obtained with the 
Russian 6-m telescope. 



1. Introduction 

Storing and querying structured metadata is an important point for building 
astronomical archives containing heterogeneous datasets coming from various 
telescopes and/or instruments. IVOA Characterisation Data Model (McDowell 
et al., in prep.) allows to describe a position of any astronomical dataset in the 
multidimensional space of physical parameters. Thus, building query interface 
on top of the characterisation metadata, will provide enough capabilities for 
elaborated queries often needed for sophisticated scientific usage of the resource. 
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<!-- description --> 

The world's <em>most advanced</em> <br l> 
open source database 

XML fragments 




datasheet 



pk ds_id 
ds_data 

ds created 



INT4 
XML 

TIMESTAMP 



office 

PK o_id INT4 
u o_name VARCHAR(50) 
ojocation CHAR(2) 
o desc TEXT 



SQL + XPath queries 

-- Local people in each office 
SELECT 

xpath_array ( 
ds_data, 

' / /person [@citizenship="' 

| | ojocation | | ']"'), 

FROM 

datasheet 

JOIN 

office ON xpath_number (ds_data, 

Vaffice/@id') = o id 



SQL/XML publishing + XPath functions 

— Full set of docs available for each office 
SELECT 

XMLELEMENT ( 

NAME "office", 

XMLATTRIBUTES (o_name AS "name"), 
XML CON CAT ( 

XML FORREST ( 



AS 



cati 



o_desc AS "description"), 
XMLELEMENT ( 

NAME "docs", 
XMLAGG{ds_data) ) ) } 
FROM office, datasheet 

WHERE xpath_number (ds_data, ' /of f ice/@id' } 
GROUP BY o id 



Figure 1. Incorporation of XPath in SQL queries 



2. Implementation 

Usage of mature and freely available DBMS engine as a backend for the query 
interface is of a big importance for further development of working prototypes 
implementing Characterisation DM, since relational database server can solve 
many data structure and manipulation problems one may encounter when de- 
ploying data archive in a consistent way. 

With the advent of native XML support in relational database engines it 
all become possible. We use open source PostgreSQL DBMS for storing and 
querying characterisation metadata. 

Initial XML type support in the PostgreSQL has recently been implemented 
by NS in a frame of " Google Summer of Code 2006" . We make heavy use of this 
feature in order to provide comprehensive search over Characterisation metadata 
tree. 

FigureQ]demonstrates the concepts used to store and query XML structures, 
and shows how XPath expressions can be incorporated in SQL queries. 
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We built a prototype of the Characterisation metadata query service, im- 
plementing two access methods: 

1. HTTP-GET/POST based interface implements almost direct translation 
of the query parameter name into XPath of the data model element, then 
incorporated into extended SQL/XML queries to XML and relational data 
being stored in DBMS. This concept allows to distinguish axes by some 
specific properties, for instance by their UCDs, and provides ability of 
putting constraints on the query result. 

2. Web-Service based interface to receive XQuery which is also directly trans- 
lated into SQL/XML statements. This method aims in dealing with 
ADQL-like queries in the future. 

3. Perspectives 

Presently XML type in PostgreSQL is developed for manipulation abilities rather 
than a specific storage engine, meaning that it is built on top of VARCHAR as 
initial storage implicit type. This somehow limits performance of queries since 
one is confined only to use functional indices based on result of XPath expression 
evaluation. 

Nearest plans of XML type development include very important features 
that will comprise: 

• XMLQUERY - standardized by SQL:2006 way of integration of XQuery 
capabilities with other essential relational functionality. Joins of XML and 
relational data in one expression. 

• Proper design at physical level (data structure, comprehensive indices sys- 
tem and Generalized Search Tree usage, etc). This will lead to fast XQuery 
and XPath evaluation. 

Thus, all performance penalties for this combination of relational and XML 
approaches will soon be minimized dramatically. 

This opens very broad perspectives for building similar systems making use 
of some XML-based data models since one can achieve all that flexibility and 
comprehensiveness right inside DBMS. 
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